Deformable registration of two-dimensional/three-dimensional (2D/3D) images of abdominal organs is a complicated task because the abdominal organs deform significantly and their contours are not detected in two-dimensional X-ray images. We propose a supervised deep learning framework that achieves 2D/3D deformable image registration between 3D volumes and single-viewpoint 2D projected images. The proposed method learns the translation from the target 2D projection images and the initial 3D volume to 3D displacement fields. In experiments, we registered 3D-computed tomography (CT) volumes to digitally reconstructed radiographs generated from abdominal 4D-CT volumes. For validation, we used 4D-CT volumes of 35 cases and confirmed that the 3D-CT volumes reflecting the nonlinear and local respiratory organ displacement were reconstructed. The proposed method demonstrate the compatible performance to the conventional methods with a dice similarity coefficient of 91.6 \% for the liver region and 85.9 \% for the stomach region, while estimating a significantly more accurate CT values.
translated by 谷歌翻译
基于治疗期间的单投影图像的器官形状重建具有广泛的临床范围,例如在图像引导放射治疗和手术指导中。我们提出了一种图形卷积网络,该网络实现了用于单视点2D投影图像的3D器官网格的可变形登记。该框架使得能够同时训练两种类型的变换:从2D投影图像到位移图,以及从采样的每周顶点特征到满足网格结构的几何约束的3D位移。假设申请放射治疗,验证了2D / 3D可变形的登记性能,用于尚未瞄准迄今为止,即肝脏,胃,十二指肠和肾脏以及胰腺癌的多个腹部器官。实验结果表明,考虑多个器官之间的关系的形状预测可用于预测临床上可接受的准确性的数字重建射线照片的呼吸运动和变形。
translated by 谷歌翻译
Optimal transport (OT) has become a widely used tool in the machine learning field to measure the discrepancy between probability distributions. For instance, OT is a popular loss function that quantifies the discrepancy between an empirical distribution and a parametric model. Recently, an entropic penalty term and the celebrated Sinkhorn algorithm have been commonly used to approximate the original OT in a computationally efficient way. However, since the Sinkhorn algorithm runs a projection associated with the Kullback-Leibler divergence, it is often vulnerable to outliers. To overcome this problem, we propose regularizing OT with the \beta-potential term associated with the so-called $\beta$-divergence, which was developed in robust statistics. Our theoretical analysis reveals that the $\beta$-potential can prevent the mass from being transported to outliers. We experimentally demonstrate that the transport matrix computed with our algorithm helps estimate a probability distribution robustly even in the presence of outliers. In addition, our proposed method can successfully detect outliers from a contaminated dataset
translated by 谷歌翻译
Classification bandits are multi-armed bandit problems whose task is to classify a given set of arms into either positive or negative class depending on whether the rate of the arms with the expected reward of at least h is not less than w for given thresholds h and w. We study a special classification bandit problem in which arms correspond to points x in d-dimensional real space with expected rewards f(x) which are generated according to a Gaussian process prior. We develop a framework algorithm for the problem using various arm selection policies and propose policies called FCB and FTSV. We show a smaller sample complexity upper bound for FCB than that for the existing algorithm of the level set estimation, in which whether f(x) is at least h or not must be decided for every arm's x. Arm selection policies depending on an estimated rate of arms with rewards of at least h are also proposed and shown to improve empirical sample complexity. According to our experimental results, the rate-estimation versions of FCB and FTSV, together with that of the popular active learning policy that selects the point with the maximum variance, outperform other policies for synthetic functions, and the version of FTSV is also the best performer for our real-world dataset.
translated by 谷歌翻译
We study the problem of sharing as many branching conditions of a given forest classifier or regressor as possible while keeping classification performance. As a constraint for preventing from accuracy degradation, we first consider the one that the decision paths of all the given feature vectors must not change. For a branching condition that a value of a certain feature is at most a given threshold, the set of values satisfying such constraint can be represented as an interval. Thus, the problem is reduced to the problem of finding the minimum set intersecting all the constraint-satisfying intervals for each set of branching conditions on the same feature. We propose an algorithm for the original problem using an algorithm solving this problem efficiently. The constraint is relaxed later to promote further sharing of branching conditions by allowing decision path change of a certain ratio of the given feature vectors or allowing a certain number of non-intersected constraint-satisfying intervals. We also extended our algorithm for both the relaxations. The effectiveness of our method is demonstrated through comprehensive experiments using 21 datasets (13 classification and 8 regression datasets in UCI machine learning repository) and 4 classifiers/regressors (random forest, extremely randomized trees, AdaBoost and gradient boosting).
translated by 谷歌翻译
This study proposes novel control methods that lower impact force by preemptive movement and smoothly transition to conventional contact impedance control. These suggested techniques are for force control-based robots and position/velocity control-based robots, respectively. Strong impact forces have a negative influence on multiple robotic tasks. Recently, preemptive impact reduction techniques that expand conventional contact impedance control by using proximity sensors have been examined. However, a seamless transition from impact reduction to contact impedance control has not yet been accomplished. The proposed methods utilize a serial combined impedance control framework to solve this problem. The preemptive impact reduction feature can be added to the already implemented impedance controller because the parameter design is divided into impact reduction and contact impedance control. There is no undesirable contact force during the transition. Furthermore, even though the preemptive impact reduction employs a crude optical proximity sensor, the influence of reflectance is minimized using a virtual viscous force. Analyses and real-world experiments confirm these benefits.
translated by 谷歌翻译
Slimmable Neural Networks (S-Net) is a novel network which enabled to select one of the predefined proportions of channels (sub-network) dynamically depending on the current computational resource availability. The accuracy of each sub-network on S-Net, however, is inferior to that of individually trained networks of the same size due to its difficulty of simultaneous optimization on different sub-networks. In this paper, we propose Slimmable Pruned Neural Networks (SP-Net), which has sub-network structures learned by pruning instead of adopting structures with the same proportion of channels in each layer (width multiplier) like S-Net, and we also propose new pruning procedures: multi-base pruning instead of one-shot or iterative pruning to realize high accuracy and huge training time saving. We also introduced slimmable channel sorting (scs) to achieve calculation as fast as S-Net and zero padding match (zpm) pruning to prune residual structure in more efficient way. SP-Net can be combined with any kind of channel pruning methods and does not require any complicated processing or time-consuming architecture search like NAS models. Compared with each sub-network of the same FLOPs on S-Net, SP-Net improves accuracy by 1.2-1.5% for ResNet-50, 0.9-4.4% for VGGNet, 1.3-2.7% for MobileNetV1, 1.4-3.1% for MobileNetV2 on ImageNet. Furthermore, our methods outperform other SOTA pruning methods and are on par with various NAS models according to our experimental results on ImageNet. The code is available at https://github.com/hideakikuratsu/SP-Net.
translated by 谷歌翻译
We present a lightweight post-processing method to refine the semantic segmentation results of point cloud sequences. Most existing methods usually segment frame by frame and encounter the inherent ambiguity of the problem: based on a measurement in a single frame, labels are sometimes difficult to predict even for humans. To remedy this problem, we propose to explicitly train a network to refine these results predicted by an existing segmentation method. The network, which we call the P2Net, learns the consistency constraints between coincident points from consecutive frames after registration. We evaluate the proposed post-processing method both qualitatively and quantitatively on the SemanticKITTI dataset that consists of real outdoor scenes. The effectiveness of the proposed method is validated by comparing the results predicted by two representative networks with and without the refinement by the post-processing network. Specifically, qualitative visualization validates the key idea that labels of the points that are difficult to predict can be corrected with P2Net. Quantitatively, overall mIoU is improved from 10.5% to 11.7% for PointNet [1] and from 10.8% to 15.9% for PointNet++ [2].
translated by 谷歌翻译
We construct a corpus of Japanese a cappella vocal ensembles (jaCappella corpus) for vocal ensemble separation and synthesis. It consists of 35 copyright-cleared vocal ensemble songs and their audio recordings of individual voice parts. These songs were arranged from out-of-copyright Japanese children's songs and have six voice parts (lead vocal, soprano, alto, tenor, bass, and vocal percussion). They are divided into seven subsets, each of which features typical characteristics of a music genre such as jazz and enka. The variety in genre and voice part match vocal ensembles recently widespread in social media services such as YouTube, although the main targets of conventional vocal ensemble datasets are choral singing made up of soprano, alto, tenor, and bass. Experimental evaluation demonstrates that our corpus is a challenging resource for vocal ensemble separation. Our corpus is available on our project page (https://tomohikonakamura.github.io/jaCappella_corpus/).
translated by 谷歌翻译
Although sketch-to-photo retrieval has a wide range of applications, it is costly to obtain paired and rich-labeled ground truth. Differently, photo retrieval data is easier to acquire. Therefore, previous works pre-train their models on rich-labeled photo retrieval data (i.e., source domain) and then fine-tune them on the limited-labeled sketch-to-photo retrieval data (i.e., target domain). However, without co-training source and target data, source domain knowledge might be forgotten during the fine-tuning process, while simply co-training them may cause negative transfer due to domain gaps. Moreover, identity label spaces of source data and target data are generally disjoint and therefore conventional category-level Domain Adaptation (DA) is not directly applicable. To address these issues, we propose an Instance-level Heterogeneous Domain Adaptation (IHDA) framework. We apply the fine-tuning strategy for identity label learning, aiming to transfer the instance-level knowledge in an inductive transfer manner. Meanwhile, labeled attributes from the source data are selected to form a shared label space for source and target domains. Guided by shared attributes, DA is utilized to bridge cross-dataset domain gaps and heterogeneous domain gaps, which transfers instance-level knowledge in a transductive transfer manner. Experiments show that our method has set a new state of the art on three sketch-to-photo image retrieval benchmarks without extra annotations, which opens the door to train more effective models on limited-labeled heterogeneous image retrieval tasks. Related codes are available at https://github.com/fandulu/IHDA.
translated by 谷歌翻译